Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Communication Lower Bounds
Authors
Abstract
A parallel algorithm has perfect strong scaling if its running time on P processors is linear in 1/P, including all communication costs. Distributed-memory parallel algorithms for matrix multiplication with perfect strong scaling have only recently been found. One is based on classical matrix multiplication (Solomonik and Demmel, 2011), and one is based on Strassen’s fast matrix multiplication (Ballard, Demmel, Holtz, Lipshitz, and Schwartz, 2012). Both algorithms scale perfectly, but only up to some number of processors beyond which the inter-processor communication no longer scales. We obtain a memory-independent communication cost lower bound on classical and Strassen-based distributed-memory matrix multiplication algorithms. These bounds imply that no classical or Strassen-based parallel matrix multiplication algorithm can strongly scale perfectly beyond the ranges already attained by the two parallel algorithms mentioned above. The memory-independent bounds and the strong scaling bounds generalize to other algorithms.
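As a rough sketch (notation assumed here rather than spelled out in the abstract: n-by-n matrices, P processors, and M words of local memory per processor; exact statements and constants are in the cited papers), the bounds behind these strong scaling limits are commonly written as
\[
W_{\mathrm{classical}} = \Omega\!\left(\frac{n^3}{P\sqrt{M}}\right), \qquad
W_{\mathrm{classical}}^{\mathrm{mem\text{-}indep}} = \Omega\!\left(\frac{n^2}{P^{2/3}}\right),
\]
\[
W_{\mathrm{Strassen}} = \Omega\!\left(\frac{n^{\omega_0}}{P\,M^{\omega_0/2-1}}\right), \qquad
W_{\mathrm{Strassen}}^{\mathrm{mem\text{-}indep}} = \Omega\!\left(\frac{n^2}{P^{2/\omega_0}}\right),
\qquad \omega_0 = \log_2 7.
\]
Under this reading, perfect strong scaling can hold only while the memory-dependent bounds dominate, i.e. up to roughly P = O(n^3/M^{3/2}) for classical and P = O(n^{\omega_0}/M^{\omega_0/2}) for Strassen-based algorithms.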
Related papers
Communication-Optimal Parallel and Sequential Eigenvalue/SVD Algorithms
Algorithms have two costs: arithmetic and communication, by which we mean either moving data between levels of a memory hierarchy (in the sequential case) or over a network connecting processors (in the parallel case). The simplest metric of communication is to count the total number of words moved (also called the bandwidth cost). On current hardware the cost of moving a single word already gr...
Communication Bounds for Heterogeneous Architectures
As the gap between the cost of communication (i.e., data movement) and computation continues to grow, pursuing algorithms which minimize communication has become a critical research objective. Toward this end, we seek asymptotic communication lower bounds for general memory models and classes of algorithms. Recent work has established lower bounds for a wide set of linear algebra algorithms (wh...
Communication Avoiding (CA) and Other Innovative Algorithms
In 1981 Hong and Kung proved a lower bound on the amount of communication (amount of data moved between a small, fast memory and large, slow memory) needed to perform dense, n-by-n matrix multiplication using the conventional O(n^3) algorithm, where the input matrices were too large to fit in the small, fast memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and extended it ...
Minimizing Communication in Numerical Linear Algebra
In 1981 Hong and Kung proved a lower bound on the amount of communication (amount of data moved between a small, fast memory and large, slow memory) needed to perform dense, n-by-n matrix multiplication using the conventional O(n^3) algorithm, where the input matrices were too large to fit in the small, fast memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and extended it...
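For reference, the Hong–Kung result mentioned in the two abstracts above is usually stated as follows (a sketch, assuming a fast memory of M words and the conventional O(n^3) algorithm): the number of words moved between fast and slow memory satisfies
\[
W = \Omega\!\left(\frac{n^3}{\sqrt{M}}\right).
\]
The parallel extension by Irony, Toledo and Tiskin replaces the fast memory with each processor's local memory and yields a per-processor bound of \Omega\!\left(n^3/(P\sqrt{M})\right).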
Graph Expansion and Communication Costs of Fast Matrix Multiplication
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely related to the expansion properties of the corresponding computation graphs. We demonstrate this on Strassen’s and other fast matrix multiplication algorithms, and obtain the first lower bounds on their communication costs. In the sequential case, where the processor has a fast memory of size M, too smal...
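The Strassen bound obtained in that line of work is, in the sequential case (again a sketch under the assumptions stated earlier, with fast memory of M words),
\[
W = \Omega\!\left(\left(\frac{n}{\sqrt{M}}\right)^{\omega_0} M\right), \qquad \omega_0 = \log_2 7,
\]
with a parallel analogue obtained by dividing by P; the memory-independent counterpart quoted after the main abstract above is what limits its perfect strong scaling range.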
Journal: CoRR
Volume: abs/1202.3177
Year of publication: 2012